SUMMARY OF RESEARCH


The  overall  goal  of  the  present  research  is  to  construct  a  safe, 
effective  human  anthrax  vaccine  using  recombinant  DNA  techniques.  These 
studies  are  broken  down  into  three  phases: 

Phase I. Isolation and characterization of the Bacillus anthracis toxin
genes for protective antigen (PA), lethal factor (LF) and edema factor
(EF). The individual toxin genes will be cloned in expression vectors for
large-scale production of toxin proteins using E. coli and B. subtilis.
These experiments should provide enhanced production of the different
toxin components, which are made at low levels in E. coli.

Phase II. Generation of mutant toxin proteins from the cloned toxin genes
defined in Phase I. Mutations derived from deletion analysis or site-specific
mutagenesis of the cloned toxin genes will be generated using in vitro
manipulations of the recombinant plasmid DNAs. Mutations of potential use
for vaccine construction will be identified as those which are non-toxic
but still immunologically active and protective.

Phase III. Insertion of mutant genes back into B. anthracis with the
selective removal of wild-type genes. These mutant strains will then be
tested in animals, such as the mouse or guinea pig.

The research outlined in this annual report describes the cloning and
characterization of the individual B. anthracis toxin genes. These genes
are being expressed in B. subtilis and E. coli and are being specifically
mutated to generate mutant derivatives which lack biochemical activity but
maintain immunological properties. In addition, a physical characterization
of the B. anthracis plasmids with regard to size, genetic complexity, GC%
and restriction enzyme mapping is described.
FOREWORD


The investigators (Principal Investigator and Graduate Students) have
abided by the National Institutes of Health Guidelines for Research Involving
Recombinant DNA Molecules (May, 1986). Supplemental guidelines pertaining
to the subcloning of the individual B. anthracis toxin genes in
sporulation-competent B. subtilis were approved by the NIH committee on
toxins on March 13, 1986. All recombinant DNA research has also been
registered with and approved by the Brigham Young University Institutional
Biosafety Committee.
BACKGROUND


As discussed in the summary, the goal of the experiments performed in
this laboratory is to develop a more effective human anthrax vaccine for
the protection of U.S. Army troops using recombinant DNA techniques. The
current human anthrax vaccine consists of alum-precipitated supernatant
material from fermenter cultures of B. anthracis, which is predominantly
PA (protective antigen) (1). Unfortunately, this vaccine may not be
effective against all strains of B. anthracis, since several virulent strains
have been classified as "vaccine resistant" with regard to this human
vaccine (2). Clearly, an effective vaccine must afford immunological
protection against all strains of B. anthracis and against all forms of
infection, including aerosol.

Virulent strains of B. anthracis contain two different plasmids. The
toxin plasmid (pXO1) is necessary for expression of the three toxin proteins
(3,4) and the capsule plasmid (pXO2) (5,6) is necessary for production of
the poly-D-glutamic acid capsule (5,7). In order to insert mutant toxin
genes back into B. anthracis for the production of a safe vaccine strain,
it has been necessary to characterize these plasmids. Studies designed to
physically characterize these plasmids have included buoyant density
centrifugation, DNA melting analysis and restriction endonuclease mapping
of these DNAs. These characterizations should be helpful in generating
recombinant vaccine strains of B. anthracis and in understanding the
physical organization of these DNAs.

Each of the anthrax toxin genes has been cloned (4,8,9). The PA and EF genes
have been sequenced (13,15). Experiments aimed at expressing these toxin
genes in large quantities in E. coli, B. subtilis and B. anthracis are in
progress. In addition, we are mutating the different toxin genes in order
to generate mutant toxin proteins which are still immunogenic but
biochemically non-functional, to be used in vaccine development.


RESULTS 

Restriction maps of pXO1 and pXO2. The restriction maps for pXO1 and
pXO2 (see Figures 1 and 2) are essentially completed for enzymes which
cleave a few times, such as PstI, BamHI, ClaI, SstI, BglII and PvuII.
Experiments to map the more frequently cutting enzymes, such as EcoRI and
HindIII, are presently being completed. We have generated recombinant DNA
libraries for pXO1 and pXO2 in bacteriophage lambda as well as in plasmids
in order to generate a complete map for the most commonly used restriction
enzymes. A detailed restriction enzyme map of the LF and PA gene regions
on pXO1 is shown in Figure 3.

In a final effort to map pXO1 and pXO2, we are identifying the number
and location of the different RNA transcripts from these plasmids. This
research project involves the identification of the different promoters
and the RNAs made from them. Basically, we are cleaving pXO1 and pXO2
with an enzyme which cleaves these DNAs many times, such as MboI or Sau3A,
generating DNA fragments which can ligate to BamHI-cleaved plasmids.
Using B. subtilis plasmids which have been cleaved at a BamHI site located
upstream of a promoterless chloramphenicol resistance gene (10), we will
insert the pXO1 or pXO2 DNA fragments into these promoter identification
plasmids. After transformation of these recombinant plasmids into
B. subtilis, we will identify bacteria which are now resistant to
chloramphenicol. These plasmids will contain a functional promoter (derived
from pXO1 or pXO2) driving the transcription of the chloramphenicol
resistance gene. The recombinant DNA inserts prepared from these promoter
expression plasmids will then be mapped on pXO1 or pXO2. The size and
direction of RNA transcription will also be determined. This procedure is
very powerful, and should allow us to identify and position most, if not
all, of the functional promoters from the B. anthracis plasmids, assuming
that all these promoters will also function in B. subtilis. However, with
the recent discovery that we can transform B. anthracis using
electroporation, we will also be able to transfer these promoter plasmids
to B. anthracis for promoter identification directly in the parent organism.
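The fragment-cloning step above relies on MboI and Sau3A leaving the same single-stranded end as BamHI. A minimal Python sketch of that compatibility check follows. The recognition sequences and cut positions are standard textbook values rather than data from this report, and the names ENZYMES, five_prime_overhang and can_ligate are hypothetical helpers introduced only for illustration.

```python
# Recognition sequence and top-strand cut offset (number of bases 5' of
# the cut). Assumption: standard published sites; only enzymes that leave
# a 5' overhang on a palindromic site are handled by this sketch.
ENZYMES = {
    "MboI": ("GATC", 0),     # ^GATC
    "Sau3A": ("GATC", 0),    # ^GATC
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
    "EcoRI": ("GAATTC", 1),  # G^AATTC
}


def five_prime_overhang(enzyme):
    """Return the single-stranded 5' overhang left by a symmetric cut."""
    site, cut = ENZYMES[enzyme]
    # The bottom strand is cut at the mirror position, so the overhang is
    # the stretch of the site lying between the two cuts.
    return site[cut:len(site) - cut]


def can_ligate(insert_enzyme, vector_enzyme):
    """True when both enzymes leave the same (self-complementary) overhang."""
    return five_prime_overhang(insert_enzyme) == five_prime_overhang(vector_enzyme)


if __name__ == "__main__":
    for enzyme in ("MboI", "Sau3A", "EcoRI"):
        print(enzyme, "fragments ligate to a BamHI-cut vector:",
              can_ligate(enzyme, "BamHI"))
```

Because the 4-base MboI/Sau3A site occurs far more often than the 6-base BamHI site, this pairing gives many small inserts that all fit a single vector site, which is presumably why it suits a promoter search.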

Characterization of the edema factor gene (cya). The edema factor is
a calmodulin-dependent adenylate cyclase (11,12). We have successfully
cloned and sequenced the EF gene (cya); the DNA sequence (13) was reported
in the previous annual report. A paper describing the cloning and expression
of EF in E. coli has been published (9), and a manuscript describing the
DNA sequence and its deduced amino acid sequence has been submitted to
Gene (13), where it should soon be accepted. We have included the complete
EF amino acid sequence, deduced from its DNA sequence, in Appendix I.

Several interesting structural features of EF are apparent from its deduced
amino acid sequence. (i) EF apparently contains a 33 amino acid signal
peptide which conforms to known Bacillus leader sequences in that it starts
with charged (mostly positive) and hydrophilic residues (amino acids 1-10),
followed by a central core of hydrophobic amino acids (residues 11-23) and
then several hydrophilic residues (amino acids 24-33) prior to the start of
the mature protein. Proteolytic cleavage apparently occurs at an Ala-Met
peptide bond, near the start of a proposed α-helix (see Figure 4A), consistent
with signal processing after an Ala or Gly in bacilli (14). A 29 amino
acid leader sequence was also found for PA (13), which would likely contain
a similar secondary structure (shown in Figure 4A). Likewise, a signal
peptide of 33 amino acids would be present in the LF precursor molecule
(Figure 4A). Figure 4B shows a comparison between the amino acid sequences
near the ends of the EF, PA and LF signal peptides and the apparent position
of proteolytic cleavage. Similar amino acids at the end of the signal
peptide may be required for signal peptidase recognition or for secretion.
(ii) A very strong Bacillus ribosome binding site (AAAGGAGGT) is present
immediately upstream from the start of the EF protein coding region. It
is similar to the PA and LF ribosome binding sites (both of those genes
have a ribosome binding site sequence of AAAGGAG). (iii) Amino acid
residues 347 to 355 of the EF-precursor protein contain the sequence
Gly-x-x-x-x-Gly-Lys-Ser (where x = any amino acid), which is a perfect match
to a consensus sequence present in prokaryotic and eukaryotic ATP and GTP
binding proteins (16). The Lys residue is part of the ATP binding sites of
these proteins and appears to be part of the EF ATP binding site as well:
when we replaced this Lys within EF with an Asn using site-specific
mutagenesis procedures, cyclase activity was reduced 90-95% (unpublished
data of author). (iv) We have also identified a domain in EF which could
represent its putative calmodulin-binding site. As described in the EF
sequencing paper (13), calmodulin-binding proteins often contain an
α-helical region with charged or hydrophilic residues on one side and
hydrophobic residues on the other. Such an amphiphilic helical region is
present in EF between amino acid residues 313-323 of the EF-precursor (see
Appendix IV). Interestingly, this putative calmodulin-binding site is
conserved in the B. pertussis adenylate cyclase as well (17,18). (v) No
homology was observed between the EF gene, or its deduced amino acid
sequence, and either the E. coli or yeast adenylate cyclases. However,
there are at least three regions of homology in the amino acid sequence
between EF and the B. pertussis calmodulin-dependent adenylate cyclase. A
section describing this homology is included below.
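The consensus in item (iii), Gly-x-x-x-x-Gly-Lys-Ser, is simple enough to locate with a pattern search. The Python sketch below is one way to scan a protein sequence for it and to model a Lys-to-Asn substitution in silico. The toy peptide is invented for illustration and is not the EF sequence, and the function names are hypothetical.

```python
import re

# Gly, any four residues, Gly-Lys-Ser (one-letter code).
ATP_MOTIF = re.compile(r"G.{4}GKS")


def find_atp_motifs(protein):
    """Return (start, end, lys_position) for each match, 1-based inclusive."""
    hits = []
    for match in ATP_MOTIF.finditer(protein):
        start = match.start() + 1   # convert 0-based index to 1-based
        end = match.end()           # 0-based exclusive end equals 1-based inclusive end
        lys_position = start + 6    # Lys is the 7th residue of the motif
        hits.append((start, end, lys_position))
    return hits


def substitute(protein, position, new_residue):
    """Return a copy of the sequence with one residue replaced (1-based)."""
    index = position - 1
    return protein[:index] + new_residue + protein[index + 1:]


if __name__ == "__main__":
    toy = "MKTAYGAAAAGKSLLE"  # hypothetical peptide, not from the report
    hits = find_atp_motifs(toy)
    print("motif hits:", hits)
    mutant = substitute(toy, hits[0][2], "N")  # Lys -> Asn
    print("hits after Lys->Asn:", find_atp_motifs(mutant))
```

A pattern match of this kind only flags candidate sites; the mutagenesis result reported above is what ties the Lys residue to enzyme activity.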

Characterization of the LF gene (lef). We have determined the entire
DNA sequence for the B. anthracis LF gene (lef). We easily identified the
start of the LF gene since the first 15 amino acids of the mature LF were
previously determined by Dr. J. Schmidt (USAMRIID). The LF DNA sequence
and the deduced amino acid sequence are shown in Appendix II. The LF gene
contains a good ribosome binding site (AAAGGAG) which is identical to the
proposed PA gene ribosome binding site. The LF-precursor apparently contains
a 33 amino acid signal sequence (see Figure 4A) which is removed during
secretion. This signal sequence conforms to consensus Bacillus leader
peptides (and to the EF and PA signal peptides) in that it starts with a
polar or charged region followed by 23 non-polar, hydrophobic amino acid
residues. After this 33 amino acid leader peptide, the next 16 amino acids
correspond exactly to the LF amino acid sequence determined by Dr. Jim
Schmidt (USAMRIID), except for one amino acid. Amino acid position +10 of
the mature protein (+43 of the LF-precursor) is a His (based on the DNA
sequence), whereas it was previously reported to be a Lys (based on LF
protein sequencing). Interestingly, there is a single Cys in the LF leader,
although no Cys residues are in the mature protein. The entire protein
sequence of LF is also shown in Appendix III. Lastly, there appears to be
extensive amino acid homology between LF and EF in the first 300 amino
acids of these proteins. We have detected 10 closely related domains, and
three of these highly conserved domains are underlined (and labelled #1, #2
and #3) in Appendix I and Appendix III. These homologous regions could
represent PA binding domains. Since most of these domains are highly
charged, interactions with PA may occur through a series of electrostatic
interactions.


Transcription start sites for the anthrax toxin genes. Using radiolabeled
oligonucleotides specific for each of the different toxin genes, we have
attempted to determine the start site for transcription. Using mRNA (isolated
from B. anthracis Sterne) as template, each oligonucleotide was used to
prime DNA synthesis (using reverse transcriptase) towards the 5'-end of the
respective toxin mRNA. This newly synthesized radioactive DNA was denatured
and electrophoresed on a denaturing polyacrylamide gel. Using this approach,
we have successfully identified the start sites for PA and LF gene
transcription. The PA promoter is apparently located immediately upstream
from the start of its coding region, with transcription starting about 25
bases before the first start codon for PA translation (15). Likewise, the
apparent start for LF gene transcription occurs 25 bases prior to the ATG
start codon for LF translation (about nucleotide 456 in Appendix II). We
have not yet been able to localize the start of EF gene transcription. This
failure is probably due to the low level of EF mRNA produced in B. anthracis,
which is at least 10-fold lower than either the PA or LF mRNA concentrations
(unpublished data of author).

Site-specific mutagenesis of the EF gene. Using site-specific mutagenesis
procedures, we have altered the EF gene in order to modify its enzyme activity
and to construct EF expression vectors. First, we altered the previously
identified ATP binding domain in EF, which conforms to the consensus ATP
binding site of other prokaryotic and eukaryotic ATP and GTP binding
proteins (16). This domain has a Lys residue which has been shown in those
proteins to be involved in ATP binding; in EF, this Lys was changed to an
Asn. The adenylate cyclase activity of this mutant EF, isolated from
E. coli, was reduced about 90-95%, indicating that this Lys is probably
involved in ATP binding. However, since total activity was not abolished,
other residues are probably also involved. Of particular interest is the
presence of a His two residues prior to this Lys. This His is also
conserved in the B. pertussis adenylate cyclase, as discussed below (see
also Appendix IV).

We have also removed the BglII cleavage site within the EF gene and
inserted a new BglII recognition site immediately prior to the start of
the protein coding sequence. In another experiment, we inserted a BglII
cleavage site immediately downstream from the PA promoter so that we could
fuse the PA promoter to the EF gene. This hybrid toxin gene, when inserted
into pBS42 (19) and transformed into B. subtilis, expressed EF at a level
at least as great as B. anthracis Sterne. We are in the process of determining
the precise amount produced using an ELISA or Western blot. EF was secreted
from B. subtilis and was enzymatically active in an adenylate cyclase
assay. Since PA expression is regulated by bicarbonate (20) in B. anthracis
(Dr. J. Bartkus, USAMRIID, personal communication), we are attempting to
transfer this PA promoter-EF gene plasmid into B. anthracis by
electroporation. We hope that this plasmid, when introduced into
B. anthracis, will produce regulated high levels of EF for purification and
analysis. EF gene mutants can also be generated and transferred to
B. anthracis using this plasmid construction.

Expression of toxin genes in B. subtilis and B. anthracis. In an
effort to express the toxin genes in B. subtilis for secretion, we cloned
each of the genes into B. subtilis plasmids. Initially, we expressed these
genes by cloning them downstream of a regulated promoter (in plasmid pSI-1)
which also contains a strong ribosome binding site (21). For these
constructions, we introduced unique XbaI cleavage sites prior to the start
codons for the PA, EF and LF genes. Following cleavage with XbaI (which
does not cleave within either the EF or PA genes), the entire toxin gene
was ligated into plasmid pSI-1. When transformed into B. subtilis,
transcription of the inserted toxin genes was regulated by the lac
repressor and IPTG (19,21). For these hybrid genes, the amount of PA
produced was close to the amount produced by PA1 (22; unpublished data of
author).

We also created a toxin expression plasmid using the T7 promoter
cloned upstream from the toxin gene. In order to get expression in
B. subtilis, we introduced into B. subtilis a cloned copy of the T7 RNA
polymerase gene (23). These bacteria contain the T7 polymerase gene as part
of an integration plasmid for regulated expression, since the T7 gene was
inserted into the regulatable promoter site of pSI-1. In order to select
for cells containing this polymerase, we also included the erythromycin
resistance gene from pE194 prior to integration into B. subtilis genomic
DNA (24,25). B. subtilis containing this integration plasmid should express
T7 RNA polymerase after the addition of IPTG. These cells will then be
transformed with a replication-competent plasmid containing one of the
B. anthracis toxin genes (e.g., cya, pag or lef) cloned downstream from
the T7 promoter for gene expression. Although we have not yet tested these
recombinant B. subtilis, these plasmid constructions express toxin in
E. coli using the T7 polymerase. B. subtilis containing these plasmids
should produce high-level, regulated expression of the toxin genes in a
safe bacterial host. The secreted toxin protein can then be used for
purification of the individual toxin components.

Relationships between EF and the pertussis adenylate cyclase. Bordetella
pertussis, the causative agent of whooping cough, secretes, among other
virulence factors, a calmodulin-dependent adenylate cyclase. The adenylate
cyclase appears to function independently of the pertussis toxin, but is a
required virulence factor, since strains which lack cyclase activity are
avirulent (26). Glaser et al. (18) recently showed that the cyclase catalytic
domain is about 450 amino acids in length and is part of a larger precursor
polypeptide of 1706 amino acids. Anticipating that EF and the pertussis
cyclase might be related, we performed a homology search between the
entire EF (800 amino acids) and pertussis cyclase (1706 amino acids)
translational products. Three major regions of homology (labeled #1, #2
and #3 in Appendix IV) were observed, which included the catalytic domain
of the pertussis cyclase and the carboxyl terminal 500 amino acids of EF.
This homology comparison is shown in Appendix IV. Domain #1 contains the
consensus ATP binding site, which is surrounded by highly conserved amino
acids. This high degree of amino acid conservation indicates a close
evolutionary relatedness for these two proteins. The putative
calmodulin-binding site is conserved for these proteins and is shown in
Appendix IV.

Restriction  endonuclease  cleavage  maps  for  the  anthrax  toxin  genes. 
Using  the  DNA  sequences  for  the  EF,  PA  and  LF  toxin  genes,  we  have  generated 
a  set  of  restriction  endonuclease  cleavage  maps  for  these  genes.  These 
are  shown  in  Appendices  V,  VI  and  VII.  These  maps  should  be  helpful  to 
those  researchers  using  DNA  containing  these  genes. 
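Cleavage maps of this kind can be derived from a DNA sequence by searching for each recognition site. The following Python sketch illustrates the idea for the enzymes named in this report. The recognition sequences are standard published values rather than data from the report, the toy sequence is invented, and restriction_map and non_cutters are hypothetical helper names.

```python
# Recognition sequences (5'->3', top strand). Assumption: standard
# published sites; all are palindromic, so one strand is sufficient.
SITES = {
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
    "ClaI": "ATCGAT",
    "SstI": "GAGCTC",
    "BglII": "AGATCT",
    "PvuII": "CAGCTG",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}


def restriction_map(dna, sites=SITES):
    """Map each enzyme to the 1-based start positions of its site."""
    dna = dna.upper()
    positions = {}
    for enzyme, site in sites.items():
        found = []
        index = dna.find(site)
        while index != -1:
            found.append(index + 1)
            index = dna.find(site, index + 1)  # allow overlapping sites
        positions[enzyme] = found
    return positions


def non_cutters(dna, sites=SITES):
    """Enzymes with no site in the sequence (useful for cloning a gene intact)."""
    return sorted(e for e, pos in restriction_map(dna, sites).items() if not pos)


if __name__ == "__main__":
    toy = "AAGAATTCTTTCTAGAGGGGATCCAAGAATTC"  # hypothetical sequence
    for enzyme, pos in restriction_map(toy).items():
        if pos:
            print(enzyme, pos)
    print("non-cutters:", non_cutters(toy))
```

The non-cutter list corresponds to the kind of check mentioned earlier for XbaI, which was chosen because it does not cleave within the EF or PA genes.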
CONCLUSIONS


It appears from the data reported here that Phases I, II and III of
the original research proposal are essentially completed. Each of the
anthrax toxin genes has been cloned and expressed in E. coli and, to some
extent, in B. subtilis and B. anthracis. Since we have cloned each of the
toxin genes and know their DNA sequences, we will be able to continue to
study gene expression and to characterize the toxin proteins better. We
will be able to generate toxin gene mutants for the construction of a safe
vaccine and to elucidate the biochemical activities of these proteins.
With the exception of putting the mutant genes back into B. anthracis, our
research will allow us to construct a safe recombinant DNA derived anthrax
vaccine.
FIGURE 1. Restriction map of pXO1. The positions of the LF, PA and
EF genes are depicted. The sizes of DNA fragments for each enzyme are not
included due to the lack of space.
FIGURE 3. Restriction map of the PA and LF gene regions on pXO1.
(A) The signal peptides (in bold) for EF, PA and LF are shown. The
proposed secondary structure most likely to be assumed for the
first 60 amino acids of each protein is shown (α = α-helix; β = β-sheet;
t = β-turn; blank = random coil). The amino terminal amino
acid, as determined by Dr. J. Schmidt (USAMRIID), for each mature
toxin protein is also shown.


EF  signal  peptide 

                                     ↓ start of mature EF

1  MTRNKFIPNKFSIISFSVLL FAISSSQAIEVNAMNEHYTE SDIKRNHKTEKNKTEKEKFK  60

aattt  ttfiPPPPPPPfiP  aaa  aaaaaaaaaaaaact  aaaaaaaaaaaaaaaaaaaa 

PA  signal  peptide 

                                 ↓ start of mature PA

1  MKKRKVLIPLMALSTILVSS TGNLEVIQAEVKQENRLLNE SESSSQGLLGYYFSDLNFQA  60

aaaaaappppaaappppp  aaaaaaaaaaaaa  ttttttpppppfipt  acta 

LF  signal  peptide 

                                     ↓ start of mature LF

1  MNIKKEFIKVISMSCLVTAI TLSGPVFIPLVQGAGGHGDV GMHVKEKEKNKDENKRKDEE  60

aaaaaaaaaaaaa tpppppp  p  t  tpppppp  a  aaaaaaaaaaaaaaaaaaaa 


(B) The amino acid sequence at the end of the anthrax toxin signal
peptides is shown. Cleavage occurs after Ala or Gly, consistent
with known cleavages after bacilli signal peptides (14). Similar
amino acids at the end of the signal peptides (denoted with a
vertical bar [|]) probably represent signal peptidase recognition
sequences. The numbers (-1 or +1) indicate the last amino acid
of the signal peptide and the first amino acid of the mature
toxin protein, respectively.

                                -1   +1
EF signal peptide   Glu-Val-Asn-Ala--Met
                         |   |   |
PA signal peptide   Val-Ile-Gln-Ala--Glu
                     |   |   |   |
LF signal peptide   Leu-Val-Gln-Gly--Ala


FIGURE 4. Anthrax toxin signal peptides.
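The cleavage positions in Figure 4 can be checked directly against the precursor sequences. The Python sketch below splits each 60-residue precursor from panel A at the stated signal peptide length and reports the residues flanking the cut. The helper names are hypothetical, and PA residue 27 is taken as Ile, following panel B.

```python
# First 60 residues of each precursor (Figure 4A) and the stated signal
# peptide length. Assumption: PA residue 27 is read as Ile, as in panel B.
PRECURSORS = {
    "EF": ("MTRNKFIPNK" "FSIISFSVLL" "FAISSSQAIE"
           "VNAMNEHYTE" "SDIKRNHKTE" "KNKTEKEKFK", 33),
    "PA": ("MKKRKVLIPL" "MALSTILVSS" "TGNLEVIQAE"
           "VKQENRLLNE" "SESSSQGLLG" "YYFSDLNFQA", 29),
    "LF": ("MNIKKEFIKV" "ISMSCLVTAI" "TLSGPVFIPL"
           "VQGAGGHGDV" "GMHVKEKEKN" "KDENKRKDEE", 33),
}


def split_precursor(name):
    """Return (signal_peptide, mature_fragment) for a toxin precursor."""
    sequence, signal_length = PRECURSORS[name]
    return sequence[:signal_length], sequence[signal_length:]


def cleavage_site(name):
    """Return (last four signal residues, first mature residue)."""
    signal, mature = split_precursor(name)
    return signal[-4:], mature[0]


def precursor_position(name, mature_position):
    """Convert a 1-based mature-protein position to precursor numbering."""
    return PRECURSORS[name][1] + mature_position


if __name__ == "__main__":
    for name in PRECURSORS:
        tail, first = cleavage_site(name)
        print(f"{name}: ...{tail} | {first}...")
    print("LF mature +10 is precursor position", precursor_position("LF", 10))
```

Run on these sequences, the split reproduces the panel B alignment (cleavage after Ala, Ala and Gly) and places the disputed LF residue, mature position +10, at a His.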
APPENDIX I. EF amino acid sequence

(33 aa signal peptide) ↓ start of mature EF (767 aa)

1  MTRNKFIPNKFSIISFSVLLFAISSSQAIEVNAMNEHYTESDIKRNHKTEKNKTEKEKFKDSINNLVKTE

71  FTNETLDKIQQTQDLUmPKDVLEIYSELGGEIYFTDIDLVEHKELQDLSEEiiKNSMNSRGEKVPFASR 

141  FVFF.KKRETPKLIINIKDYAlNSROSKRVyVEIGKGI  SI.DTTSKDKSLDPEFLNLIKSLSDDSDSSDLLF 

#1  #2 

211  SQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDimTVLELYAPDMFRYMNKLEKGGFF.KIS 

#3 

281  ESLKKEGVEKDRIDVLKGEKAIKASGLVPEHADAFKKIARELNTYILFRPVNKLATNL1KSGVATKGLNE 

(Potential  calmodulin  binding  site) 

351  HCKSSDWGPVAGYIPFDQDLSKKHGQQLAVEKGNLENKKSITEHEGEIGKIPLKLDHLRIEEUCENGIIL 
(Putative  ATP  binding  site) 

421  KGKKEIDNGKKYYLLESNNQVYEFRISDENNEVQYKTKEGKITVLGEKFNWRNIEVMAKNVEGVLKPLTA 
491  DYDLFALAPSLTEIKKQIPTKRMDKVVNTPNSLEKQKGVTNLLIKYGIERKPDSTKGTLSNWQKQMLDRL 
561  NEAVKYTGYTGGDVVNHCTEQDNEEFPEKDNE1FIINPEGEFILTKNWEMTGRFIEKNITGKDYLYYFNR 
631  SYNKIAPGNKAYIEWTDPITKAKINT1PTSAEFXKNLSSIRRSSNVGVYKDSGDKDEFAKKESVKKIAGY 
701  LSDYYNSANHI FSQEKKRKI S I FRG IQAYNE1 KNVUCSKQ1 APEYKNYFQYLKER1TNQVQLLLTHQKSN 
771  I  F.FKLT.YKQI.NFTF.NF.TDNFEVFQKI  IDEK 


************* 


The  sequence  contains  800  amino  acids  (Mr  92,464): 


Ala (A)    32      Leu (L)    69
Arg (R)    22      Lys (K)   103
Asn (N)    61      Met (M)     9
Asp (D)    44      Phe (F)    40
Cys (C)     0      Pro (P)    23
Gln (Q)    27      Ser (S)    55
Glu (E)    82      Thr (T)    39
Gly (G)    40      Trp (W)     5
His (H)    13      Tyr (Y)    34
Ile (I)    68      Val (V)    34

Acidic (Asp + Glu)                               126
Basic (Arg + Lys)                                125
Aromatic (Phe + Trp + Tyr)                        79
Hydrophobic (Aromatic + Ile + Leu + Met + Val)   259
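The composition table is internally consistent, which can be confirmed with a few sums. The Python sketch below uses the counts exactly as tabulated above. The dictionary and function names are hypothetical, and the charged fraction is a derived illustration rather than a figure stated in the report.

```python
# Residue counts for the 800 amino acid EF precursor, as tabulated above.
COMPOSITION = {
    "Ala": 32, "Leu": 69, "Arg": 22, "Lys": 103, "Asn": 61,
    "Met": 9, "Asp": 44, "Phe": 40, "Cys": 0, "Pro": 23,
    "Gln": 27, "Ser": 55, "Glu": 82, "Thr": 39, "Gly": 40,
    "Trp": 5, "His": 13, "Tyr": 34, "Ile": 68, "Val": 34,
}

# Residue classes as defined in the table.
GROUPS = {
    "Acidic": ("Asp", "Glu"),
    "Basic": ("Arg", "Lys"),
    "Aromatic": ("Phe", "Trp", "Tyr"),
    "Hydrophobic": ("Phe", "Trp", "Tyr", "Ile", "Leu", "Met", "Val"),
}


def group_total(group):
    """Sum the counts of all residues belonging to one class."""
    return sum(COMPOSITION[residue] for residue in GROUPS[group])


if __name__ == "__main__":
    total = sum(COMPOSITION.values())
    print("total residues:", total)
    for group in GROUPS:
        print(group, group_total(group))
    charged = group_total("Acidic") + group_total("Basic")
    print(f"charged fraction: {charged / total:.1%}")
```

The near-equal acidic and basic totals, together making up almost a third of the residues, fit the description of EF domains as highly charged.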
APPENDIX II:


Nucleotide sequence of the LF gene.


10  20  30  40  50  60  70  80  90

AAA'rrAGGA'ITrCGGTrAIXylTrACTAlTiTnTAAAATAATAGTATrAMTAG'IGGMIXjGAAATGATAAA'ItIGGCnTAAACAAAACL' 

100  110  120  130  140  150  160  170  180 

AA'l'GAAATAATCrAGAAATGGAA'lTrCl’CGAGTiTi'AGA’rrAAACOATAGGAAAAAAA'lXIACACTGTCAAGAAAAATGATAGAATCCC’i’A 

190  200  210  220  230  240  250  260  270 

GACrAATl'AAGA’l'AACCAAA'nXXn’AG'iTATAGGl'AGAAACl’rArrrATrrCTATAATAGCATGCAAAAAAGTAAATATrCTGrrCCATA 

280  290  300  310  320  330  340  350  360

Cl'AlTlTAGTAAAllA'riTAGCAAGTAAAlT.TKXn'GTATAAACAAAGlTrATCl'rAATATAAAAAATTACTTrACTITrATACAGATTA 

370  380  390  400  410  420  430  440  450 

AAATGAAAMTITITrATX^GAAGAAATATrcCCTITAATTrA'IGAGGAAATAACn.'AAMlTriXn.'AGATAC'rrTATnTATTG'rrGAAA 

460  470  480  490  500  510  520  530  540

'lG'nCAClTATAAAAAAGGAGAGA'lTAj\ATATGAATATAAAAAAAGAAHTATAAAAGl'AAlTAGrATGTCA'IX7IT£AGrAAGAGCMTr 
(r .  b .  s . )  MeitAsnlleLysiysGluPDelleLysVslIleScrMetSerCysLeuValThrAlalle 

(33  amino  acid  signal  peptide) 

550  560  570  580  590  600  610  620  630 

AC’rn.X^GlCCr(Xl(Xn^CTTTATC(XCCTTGTACAGGGGGOGGGOGGl‘CATGGTGA'rGTAGGTATGGACGTAAAAGAGAAAGAGAAAAAT 
ThrLeuScrGlyProValPhel  leProLcuVa  1G  lnGlvAlftGlvGlvHisQlvAspVnlGlvl.at.HlsVnlLvsGlu1.vsGluLvsAsn 

+1  of  mature  LF 

640  650  660  670  680  690  700  710  720 

AAAGATGAGAATAAGAGAAAAGATGAAGAAOGAAATAAAACAGAGGAAGAGCArrTAAAGGAAATCATGAAACACATrGTAAAAATAGAA 
l,ysAspGluAsnLysArgLysAspGluGlv»ArgAsnLysnirGln(GluGluH?.sLeuI,ysGluIleMetLysHisIleValLysIloGlu 

730  740  750  760  770  780  790  800  810 

Gl'AAAAGGCX5ACXJAAGCTGTTAAAA#AGAGGCAGCAGAAAAGCrACTrGAGAAAGl'AGCATCTGATGnTTAGAGATGTATAAAGCAAlT 
ValLy.sGlyGluGlv(AlaValLysLysGluAlaAlaGlulysLev>LeuGluLysVall,roSorAspVrt].IjO».KIluMetTyi.I,ysAlalle 

820  830  840  850  860  870  880  890  900 

GGAGGAMGATATATATrGTGGATGGTGATATrAGAAAACATATATCTTTAGAAGCATTATCTGAAGATAAGAA/xAAAATAAAAGACATr 
G lyG lyl ,ys  1  leTy rl loValAspGlyAs  IleThvLysHisIloScrLeuGluAlaLeuSerGluAspLysLysL.ysIleLysAi;plie 

910  920  930  940  950  960  970  980  990

TATGGGAAAGATGCl’rrATrACATCAiXCATTA'lXJTATATCGAAAAGAAGGATATGAA'JCCGTACl'rGTAATCCiWTCTTCCGAAGA'n'AT 
Tyi'GlyLysAspAlalxialietdlisGlvdlisTyrVari’yrAlal.ysGluGlyl'yrGluP  •0V'<lI>euValIleGlnSerSerGluAsp’L'yr 

1000  1010  1020  1030  1040  1050  1060  1070  1080

CTrAGAAAATACrGAAAAGGCACIGAACGTn'ATTATGAAATAGGTAAGATAlTATCAAGGGATA’PnTA'GTAAMrrAATGAACCATAr 
Va  1G  luAsnThvGluLysAl  aleu/ts  nValTy rTy vG lul  leGlyLys  I  loLeuSorArgAspI  lc-LeuSerLysI  ]  oAsnGlnProTyr 

1090  1100  1110  1120  1130  1140  1150  1160  1170 

CAGAAAlTl’n.'AGAlGTA'n'AAATACCArrAAAAATGCATCTGAlTCAGATGGiXCAAGATClTiTATTTACrAATCAGfTn'MGGAACAT 
GlnLysMioIjOuAspValtauAsiVnirlleLysAsiiAlaSerAspScrA^pGlyGlnAspLeuLeuPheThrAsnGlnLiQuLysGluHis 
1180  1190  1200  1210  1220  1230  1240  1250  1260


CC(ACAGAClTlTClGTAGAAWX!TVGGAACAAAATAGCAATCAGGTACAAGAAGlAlTrGCGAAAGGunTrGCATATTATATCGAGCCA 


pL'ollicAspVhoSei.ValGlvJ’heLGitCluClnAsnSerAsnGluValGliiGluValPheAlaLysAlaPhoAlaTj'iL'TyrlleGlul’i'o 


1270  1280  1290  1300  1310  1320  1330  1340  1350 

CAGCATOGTGATG'nTrAGAGCi'lTATGCACCGGAAGGlTTrAA'lTACATGGATAAA'nTAACGAACAAGAAATAAAl'CTATOOTlGGAA 
GlnUisArgAspVAlLouGlnLcu'l’yfAlaProGluAlaPhcsAstViyfMGCAspLysPheAsnGluGlnGluIleAsnLeuSefLBuGlu 


1360  1370  1380  1390  1400  1410  1420  1430  1440 

GMClTAAAGATCXACGGATC^TCTCAAGATATGAAAAATGGGAAAAGATAAAAGAGCACTATCMCACTGGAGOGArrCTrn'ATCTGAA 
GluLouLysAspGlnArgM^CLeuSofArg'iycGluLysTtpGluLysllcLysGlnltisl'yt'GlnHlsTrpSevAspSetLeuSetGlu 

1450  1460  1470  1480  1490  1500  1510  1520  1530 

GAAGGA/W7AGGACrrrTAAAAAAGCTGCAGATl\3CrrAlTGAGCCAAAGAAAGA'rGACATAATTCATTClTrATGTCA/\GAAGAAAAAGAG 
GluGlyAi'gGlyLeuLeaLyslysIjetKJltillePtoneGlul^oTyfiLy.sAspAspllGlloHisSerLeuSefGlnGluGUiLysGlu 

1540  1550  1560  1570  1580  1590  1600  1610  1620 

CrrCTAAAAAGAATACAAATlXJATAGTAG'rGATTTriTATCrACTCAGGAAAAAGAGrmTAAAAAAGCl'AGAAA'ITGATAlTCGTGAT 
LeuLouLysArgllcGlnllcAspScrSerAspPheLeuSerlhrGluGluLysGluPheLeuLysLysLeuGlnllcAsplleAreAsp 

1630  1640  1650  1660  1670  1680  1690  1700  1710 

TCiTTAlGTGAAGAAGAAAAAGAGGlTlTAAATAGAATACAGGTGCATAGTAGl'AATCCL'lTATCTGAAAAAGAAAAAGAGlTITrAAAA 
SorLauSetGluGluGluLysGluLeuLeuAsnArglleGlnValAspSorSerAsnProLeuSorGUiLysGluLysGluPP.eLevxLys 

1720  1730  1740  1750  1760  1770  1780  1790  1800 

AAGCTXMvV\CTlv?ATATrCAACCATATGAT«\TrAiATG4MCX7PIXJ<lV»GATACACX5AGCXTfAAri'CA,IAGTCCCn!CAA3TAATCrrGAT 
IysUnil.ysliei.iAspIlcGlnPfol'yi'AspIlcAanGliWrgljOV.iGltiAsp’llu’GlyGlyLcuIlcAspSerPt'oSorlloAsnLoviAsp 

1810  1820  1830  1840  1850  1860  1870  1880  1890 

CTMGAAAGGAGTATAAAAGWATATTCAAAATAlTGATGClTTAlTACATCMTCCATrGGAAGTACCTTGTACAATAAAATTTArTrG 
ValArgLysGlnl'ytI.ysAvgAspIleGlnAsnTleAsi)Alal>ovtLetviHirfGlnSevIleGlySe,:11u'LeuTyt’Ast\LysIleTyfIJeu 

1900  1910  1920  1930  1940  1950  1960  1970  1980 

TA’rG/1AAATAlX5AATATGAATAACCrt'ACAGCAACCCTAGGTGCGG7.TTrAGlTGArrCCACTGATAATAGTAAAArrAATACiAGGTATl' 
TytGluAsnMet'As«tIl.«AsiVtenlMiThtAlnThi:LBiiGlyAlaAspI.euValAspSerlhrAspAsiYrhrLysIlcAsnArgGlyIle 

1990  2000  2010  2020  2030  2040  2050  2060  2070 

TfCAA'IXJAA'riXIAAAAAAAA'm'CAAATATAGTATrTCTAffrAACrA'rATGATTGnGATATAAATGAAAGGCClXJCVVTrAGATAATCAG 
PlieAsi\Glu»H^eLysLysAsnPhGlys'ryrScrllcScrSorA<!nTyrMctIlcValAspIicAsnGl\.iAr£;ProAlaIx,.«AspAsnGlvi 

2080  2090  2100  2110  2120  2130  2140  2150  2160 

0CnTGAAATGGAGAATCCAA.TTATCACCAGATACrCGAGCAG(’ATA'rrrAGAA/%A'lX;(AAACXrn’ATATrACAAAC^AAACATCCX7TC'lG 
ArgLeuLysTrpAvgl  ioGlnLeviSorPfoAspThrArgA  laGlyTy  rLeuG  UiAsnG  lyLy  sLcul  leLeuGlnAvgAsnl  leGlyLeu 

2170  2180  2190  2200  2210  2220  2230  2240  2250 

GAAATAAAGGATGTACMATAArrMGCAATCCGAAAAAGAA'l’ATATAAGGATrGATGCGAAAGTAGTGCCAAAGACTAAAATAGATACA 
Glul  loLysAspValGlnllilleLysGlnSei'GluLysGluTyrlleAvglloAspAlaLysVnlVa]  Prol  .ysSerl.ysI  LoAspTht 

2260  2270  2280  2290  2300  2310  2320  2330  2340 

AAAATrCAAGAAGrACAGTlAAATATAAATCAGGAATGGMTAAAGCATrAGGGTlACCAAAATATACAAAGCTlATiACATTCAACGTG 
LyullcGluGli.iA.laGlnLGuAsi»IleA;jnGlnGluTi:pA5aLyaAlalA)uGlyI(0uP)'oIJysl'yrn^i'LysIvfiuIlc'riuThoAsnVal. 

2350  2360  2370  2380  2390  2400  2410  2420  2430 
GATA^.TAGATATGCATCCAATATTCTAGAAAGTGCITATITMTATTGAATGAATGGAAAMTAATATTCAAAGTGATCrTATAAAAAAG 
HisAsnArgTyrAlaSerAsnIleValGl»»SerAlaTyrLeuIleLeviAsnGluTrpLysAsnAsnIleGlnSerAspbeuIleI,ysLys 

2440  2450  2460  2470  2480  2490  2500  2510  2520 
GTAACAAATTACTTAGTTGATGGrAATGGMGATTTGTITITACCGATATTACTCTCCCTAATATAGCl'GAACAATATACACATGAAGAT 
ValThrAsnTyrLeuValAspGlyAsnGlyArgPheValPheThrAspIleThrLeuProAsnlleAlaGluGliVTyrThrHisGloAsp 

2530  2540  2550  2560  2570  2580  2590  2600  2610 
GAGATATATGAGCAAGTTGATTGAAAAGGGTTATAIGTTCCAGAATCCCGTTCrrATATTACrCCATGGACCrrrCAAAAGGTGTAGAATrA 
GluIleTytGluGlnValHisSerLysGlyLeuTyrValProGluSerArgSerlleLeuLeuHisGlyProSerLysGlyValGluLeu 

2620  2630  2640  2650  2660  2670  2680  2690  2700 
AGGAATGATAGTGAGGGTTTTATACACGAATTTGGACATGCrGTGGATGATTATGCTCGATATCTATTAGATAAGAACCAATCrGATTTA 
ArgAsnAspSerGluGlyPhelleHisGluI’heGlyHisAlaValAspAspTyrAlaGlyTyrLeulauAspLysAsnGlnSerAspLeu 

2710  2720  2730  2740  2750  2760  2770  2780  2790 
GnACAAATTCnvWWVAATrCATTGATATTTniMGGAAGAAGGGAGTAATTTAACTTOGTATGGGAGAACAAATGAAGCGGAATriTrT 
ValThrAsnSerLysLysHielleAspIleFheLysGluGluGlySerAsnLeuThrSerTyrOlyArgThrAsnGluAlaGluPhePhe 

2800  2810  2820  2830  2840  2850  2860  2870  2880 
GCAGAAGCCmAGGTIAATG(ATTCTACGGACCATGCTCAAGGTrTAAAAGrrTCAAAAAAATGCTCCGAAAACrrrCCAATXTATTAAC 
AlaGluAlaPheArgLeuMetHisSerThrAspHisAlaGluArgLeuLysValGlnLysAsnAlaProLysThrPheGlnPhelleAsn 

2890  2900  2910  2920  2930  2940  2950  2960  2970 
GATCAGATTAAGTTCATTATrAACTCATAAGrrAATGXATTAAAAATTTrCAAATGGATrTAATAATAATAATAA'EAATAATAATMCGGG 
AspGlnlleLysPhel le IleAsnSer 

2980  2990  3000  3010  3020  3030  3040  3050  3060 
ACCACCCATrATCAAGCAACTAATTCTAGACTTGATAGTAATKrrTGGGAAGCACCAGATAGTGTAAAAGGTGGGATTGCCAGAATGATA 


3070  3080  3090  3100  3110  3120  3130  3140  3150 

TrTTATGTGITCGTTAGATATGAAGGCAAAAACAATGATCCTGACXnAGAACrrTAATGATAATGTTATTAATAATTTAATGCCTITTATA 

3160  3170  3180  3190  3200  3210  3220  3230  3240 

GGAATATIAGTAAAAGTGCCGAAAAGATCCriGTrGCAAAGCnTTAAAGAACATATTATrCTATCAAGTGGCTGTATATITrc’rGTAATT 

3250  3260  3270  3280  3290 

TTCAATAAATTTTGTAATTAAGGATAOGTCAAAAAACCGAAATCTGAGCTC 


SstI 


APPENDIX  III.  LF  amino  acid  sequence 


(29 aa signal peptide)  |- Start of mature LF (780 aa) 

1  MNIKKEFIKVISMSCLVTAITLSGPVFIPLVQGAGGHGDVGMHVKEKEKNKDENKRKDEERNKTQEEHLK 

71  EIMKHIVKIEVKGEEAVKKEAAEKLLEKVPSDVLEMYKAIGGKIYIVDGDITKHT.SLEALSEDKKKIKDI 

141  YGKDAL1HEHYVYAKEGYEPVLVIOSSEDYVENTEKALNVYYEIGKILSRDILSKINOPYOKFLDVLNTI 

#1 

211  KNASDSDGODLLFTNOLKEHPTDFSVEFLEONSNEVOEVFAKAFAYYIEPOHRDVLQLYAPEAFNYMDKF 
#2  #3 

281  NEQEINLSLEELK.DQRMLSRYEKWEKIKQHYQHWSDSLSEEGRGLLKKLQIPIEPKKDDIIHSLSQEEKE 

351  LLKRIQIDSSDFLSTEEKEFLKKLQIDIRDSLSEEEKELLNRIQVDSSNPLSEKEKEFLKKLKLDIQPYD 

421  iNQRLQDTGGLIDSPSINLDVRKQYKRDIQNIDALLHQSIGSTLYNKIYLYENMNINNLTATLGADLVDS 

491  TDN1L1NRG1FNEFKKNFKYSISSNYM1VDINERPALDNERIJCWRIQLSPDTRAGYLENGKLILQRNIGL 

561  EIKDVQIIKQSEKEYIRIDAKWPKSKIDTKIQEAQLNINQEWNKALGLPKYTKLITFNVHNRYASNIVE 

631  SAYLILNEWKNNIQSDLIKKVTNYLVDGNGRFVFTDITLPNIAEQYTHQDEIYEQVHSKGLYVPESRSIL 

701  LHGPSKGVELRNDSEGFIHEFGHAVDDYAGYLLDKNQSDLVTNSKKFIDIFKEEGSNLTSYGRTNEAEFF 


771  AEAFRLMHSTDHAERLKVQKNAPKTFQFINDQIKFIINS 




************* 

The sequence contains 809 amino acids (Mr 93,798):

Ala (A)    34        Leu (L)    80
Arg (R)    27        Lys (K)    86
Asn (N)    54        Met (M)    10
Asp (D)    55        Phe (F)    29
Cys (C)     1        Pro (P)    21
Gln (Q)    41        Ser (S)    54
Glu (E)    79        Thr (T)    28
Gly (G)    35        Trp (W)     5
His (H)    21        Tyr (Y)    35
Ile (I)    74        Val (V)    40

Acidic (Asp + Glu)                                   134
Basic (Arg + Lys)                                    113
Aromatic (Phe + Trp + Tyr)                            69
Hydrophobic (Aromatic + Ile + Leu + Met + Val)       273
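The group totals in the composition table follow directly from the per-residue counts. The short Python check below is only an illustrative sketch: the dictionary and the helper name `group_total` are not part of the original report, and the counts are transcribed from the table as printed.

```python
# Residue counts transcribed from the LF composition table in this appendix.
counts = {
    "Ala": 34, "Arg": 27, "Asn": 54, "Asp": 55, "Cys": 1,
    "Gln": 41, "Glu": 79, "Gly": 35, "His": 21, "Ile": 74,
    "Leu": 80, "Lys": 86, "Met": 10, "Phe": 29, "Pro": 21,
    "Ser": 54, "Thr": 28, "Trp": 5, "Tyr": 35, "Val": 40,
}


def group_total(*residues):
    """Sum the counts of the named residues (hypothetical helper)."""
    return sum(counts[r] for r in residues)


total = sum(counts.values())
acidic = group_total("Asp", "Glu")
basic = group_total("Arg", "Lys")
aromatic = group_total("Phe", "Trp", "Tyr")
# The table defines "hydrophobic" as aromatic residues plus Ile, Leu, Met, Val.
hydrophobic = aromatic + group_total("Ile", "Leu", "Met", "Val")

print(total, acidic, basic, aromatic, hydrophobic)
# → 809 134 113 69 273
```

The total of 809 residues also matches the 29 aa signal peptide plus the 780 aa mature protein stated at the head of Appendix III.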
APPENDIX IV. Homology Comparison between EF and pertussis cyclase. 


Calmodulin  Site  ATP  binding  Site 

—  -  —  — - ♦-  it  'Mck 

289  EKMlIIMittSKAIXASGLVPEHADAFKKIABELNIYrLFRP  '  -  AMJKSGVAaKGME^^ 

:  |  :  ||:  II  I  '1  =  1  |:  ”11  !l  M  II  lllllll  I  llllll  lllll  s  Mil  I 

1  M£SH^\GttNAAI*ESCTPAAVlIxnKAVAKEXNAJlHEm/NP^^ 

* . . . Domain  #1 - — + 


379  AVEKOJIIl^nEHBGEiaCtPL  K  UMR3^ELKENGTL]IMjXKEIDNGXKYYIIJEi2|Ny/IEFRISraNNEW55KlKBGKnVIJ 

I  II  :  I  I  II  II  I  I  I  I  I  i  I  I  II  III:  I  I 

91  EVHARAIMWNSSTAHGHIAVDLTISKERIlIilRQAGL  VUG  M&DG  WASMlAGyEQfE  FKVKE  TSDGRBWQffiRK  G 

♦-Domain  #2-» 

466  GH<niH^E\MAKNVECVLKPLMDYDLFALAP  SUEIKKqnTKRMEKV  VOT  PNSLEM^VMIJ;  KYGIER  KPDST 

|:  I  :  I  |:  I  lllll  IMIM  I  II  I  I  |:  I  :  I  I  I 

168  GDDF  EAWV  ICNAAG  IIITADIIlffAlMEWI^NniDSARSSVIS(31SVTDYIAKIERAASEAP3GLDREIlIDIIlKIARAClAESA 

<— - Domain  #3 - — ► 

546  KGTLStWQ  PCM  LDRLNE  AVKYKJIfTQG  DVVltl71H)DNE^TCXDNEIFIINFEG£  FI13KNWEMTCRF1H<NIT 

II  I  I  III:  I  IIIMIIIIM  III  I :  II”  II  Ml:  M  : 

253  VCTlEABRI^RYDGEMNIGVrilX^ElJEVRNAlNRKAHAiAlAQDVVQHGIBCyM  PFPEADQQFVVSAIGE9C?OBGQ  IKEYIGQQ  R 

621  QOM^YFNRSWIAPGNKAnEWroP  TIKAmTIPISAIilTONISSIRRSaWGVWDSCI*^ 

I  l”l  IIM  I  I  I  :  Ml  I  I  ill  I  I  I  Ml  I 

339  GEGWFYH^RAVGVACKSLFDDGLGAAPGVPSCSSKFSHWLEIVPASEGLRRPSLGAVEEQDSG  1DSLDGVGSRSFS1GEVSD  MAA 


709 


426 


!HIF9QEKKRKISIFI(GIQAjNEI£NVIXSW[JIAPE1!KNYR3yiXERITN^lIJinCJCSNIEFKlIJa3}INFTENElINfWRjaiDEX 

:  I  I  :  lllll  M  I  |:  I  I  II 

VEAAE10fIiqVIHAGAKy*AE  PGV  SGASAtWQQRAIQ  GAQAVAAAyD7HAIAI>nQI1GRAGSlIHI\}EAASLSAAVR5LGEAS3 


1. Domains #1, #2 and #3 represent three highly conserved amino acid 
domains in EF (top line of each pair) and the pertussis cyclase (bottom 
line of each pair). 

2. The numbers to the left of each line indicate the amino acid position 
for the EF precursor or the pertussis cyclase. 

3. The asterisks (*) indicate the consensus sequences for the ATP binding 
site for EF and the pertussis cyclase. 
APPENDIX V: Restriction enzyme cleavage sites for EF gene. 

cya gene boundary (544-2943) 

Map scale (bp): 0  500  1000  1500  2000  2500  3000 

Enzymes mapped: 

AFL 3, AHA 3, ALU 1, APA 1, AVA 1, AVA 2, BAN 2, BBV 1, BCL 1, BGL 2, 
BIN 1, BSP 128, BSP M1, BST HI, DDE 1, ECO OlO, ECO R1, FNU 4H1, FOK 1, 
HAE 1, HAE 3, HGI J2, HHA 1, HINC 2, HIND 3, HINF 1, HPA 1, HPA 2, HPH 1, 
MBO 2, MNL 1, MST 2, NCO 1, NDE 1, NLA 3, NLA 4, NSP B2, NSP C1, PPU M1, 
PVU 2, RSA 1, SAU 1, SAU 3A, SAU 96, SCR F1, SOU 1, SFA N1, SPH 1, SSP 1, 
STU 1, STY 1, TAQ 1, XBA 1, XHO 1, XHO 2, XMN 1 

[Restriction map: the per-enzyme cleavage-site tick marks are not legible in the source scan.] 
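A restriction map of this kind is built by locating every occurrence of each enzyme's recognition sequence along the DNA. The Python sketch below illustrates the idea on a made-up toy sequence; the function name `find_sites`, the `SITES` table and the toy sequence are illustrative assumptions, not the program or data used in the report. It searches one strand only, which is sufficient for the palindromic sites shown.

```python
def find_sites(sequence, recognition_site):
    """Return 1-based start positions of every occurrence of the site.

    Overlapping occurrences are reported. Only the given strand is
    searched, which is enough for palindromic recognition sequences.
    """
    seq = sequence.upper()
    site = recognition_site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = seq.find(site, start + 1)
    return positions


# Well-known palindromic recognition sequences (SstI recognises the same
# sequence as SacI).
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "SacI": "GAGCTC"}

# Hypothetical toy sequence, not taken from any gene in this report.
toy = "ATGAATTCGGAAGCTTCCGAGCTCTTGAATTCAA"

site_map = {name: find_sites(toy, site) for name, site in SITES.items()}
for name, positions in site_map.items():
    print(name, positions)
# → EcoRI [3, 27]
# → HindIII [11]
# → SacI [19]
```

Positions that fall between the stated gene boundaries (for example 544-2943 for cya) would then be the sites lying inside the coding region.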
APPENDIX VI: Restriction enzyme cleavage sites for PA gene 

pag gene boundary (1804-4095) 

Map scale (bp): 0  500  1000  1500  2000  2500  3000  3500  4000 

Enzymes legible in the scan (entries with a missing suffix are listed by name only): 

AVA, AVA 3, BAM H1, BAN 1, BBV 1, BCL 1, BIN 1, BSP, BSP M2, BST E2, 
BST N1, DDE 1, ECO R1, FNU 4H1, FNU D2, FOK 1, HAE 2, HGA 1, HGI A1, 
HGI C1, HHA 1, HINC 2, HIND 3, HINF 1, HPA 1, HPA 2, HPH 1, MBO 2, MNL 1, 
NAR, NCO 1, NLA, NRU 1 

[Restriction map: the per-enzyme cleavage-site tick marks and the remaining enzyme labels are not legible in the source scan.] 
APPENDIX VII: Restriction enzyme cleavage sites for LF gene 

lef gene boundary (481-2916) 

Map scale (bp): 0  500  1000  1500  2000  2500  3000 

Enzymes legible in the scan (entries with a missing suffix are listed by name only): 

AFL, AVA, BSP 128, BST, DDE 1, ECO R5, HAE 1, HAE, HGI A1, HGI J2, 
HINF 1, HPA 2, HPH 1, NDE 1, PST 1, SAC 1, SAU 3A, SAU 96, SFA N1, SSP 1 

[Restriction map: the per-enzyme cleavage-site tick marks and the remaining enzyme labels are not legible in the source scan.] 